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We describe algorithms, and experimental strategies, for the Pareto optimal control problem 
of simultaneously driving an arbitrary number of quantum observable expectation values to their 
respective extrema. Conventional quantum optimal control strategies are less effective at sampling 
points on the Pareto frontier of multiobservable control landscapes than they are at locating optimal 
solutions to single observable control problems. The present algorithms facilitate multiobservable 
optimization by following direct paths to the Pareto front, and are capable of continuously tracing 
the front once it is found to explore families of viable solutions. The numerical and experimental 
methodologies introduced are also applicable to other problems that require the simultaneous control 
of large numbers of observables, such as quantum optimal mixed state preparation. 
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I. INTRODUCTION 

The methodology of quantum optimal control has 
been applied extensively to problems requiring the max- 
imization of the expectation values of single quantum 
observables. Recently, an important new class of quan- 
tum control problems has surfaced, wherein the aim is 
the simultaneous maximization of the expectation values 
of multiple quantum observables, or more generally, con- 
trol over the full quantum state as encoded in the density 
matrix. Whereas various algorithms for the estimation 
of the density matrix have been reported in the litera- 
ture [l[ , comparatively little attention has been devoted 
to the optimal control of the density matrix or multiob- 
servables [2, 3, 4] . Such methodologies are important in 
diverse applications including the control of product se- 
lectivity in coherently driven chemical reactions |J], the 
dynamical discrimination of like molecules [1, Q , and the 
precise preparation of tailored mixed states. 

In a recent work we reported an experimentally 
implementable control methodology - quantum multi- 
observable tracking control (MOTC) - that can be used 
to simultaneously drive multiple observables to desired 
target expectation values, based on control landscape 
gradient information. It was shown that, due to the 
fact that the critical manifolds of quantum control land- 
scapes have measure zero in the search domain, track- 
ing control algorithms are typically unobstructed when 
following paths to arbitrary targets in multiobservable 
space. Therefore, these algorithms can facilitate multi- 
observable control by following direct paths to the target 
expectation values, and are typically more efficient than 
algorithms based on optimization of a control cost func- 
tional. The MOTC method falls into the general cate- 
gory of continuation algorithms for multiobjective opti- 
mization [sj. These were introduced several years ago 
as alternatives to stochastic multiobjective optimization 
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algorithms, which do not exploit landscape gradient in- 
formation. 

In this paper, we extend the theory of MOTC to 
complex multiobjective optimization problems such as 
multiobservable maximization, and propose experimen- 
tal techniques for the implementation of MOTC in such 
complex multiobservable control scenarios. Multiob- 
jective maximization typically seeks to identify non- 
dominated (rather than completely optimal) solutions, 
which lie on the so-called Pareto frontier Con- 
ventional multiobjective control algorithms require sub- 
stantial search effort in order to adequately sample the 
Pareto frontier and locate solutions that strike the de- 
sired balance between the various objectives. Direct 
application of optimal control (OC) or MOTC al- 
gorithms is also inefficient in other applications, such 
as quantum state preparation, that require control of a 
large number m of observables. 

When naively apphed, both OC and MOTC fail to 
exploit the simple underlying geometry of quantum 
control landscapes [7|, by convoluting a) the dynamic 
Schrodinger map e{t) i— > U(T) between time-dependent 
control fields and associated unitary propagators at the 
prescribed final time T with b) the kinematic map 
U{T) 1— » $(C/(T)) between unitary propagators and as- 
sociated observable expectation values. In multiobserv- 
able Pareto optimal control, the Pareto frontier on the 
domain U{N) has a simple geometric structure, which 
can be exploited to efficiently sample the corresponding 
frontier on the domain e{t). Here, we develop strategies 
that combine the application of MOTC with quantum 
state estimation and kinematic optimization on IA{N) 
for the purpose of solving Pareto optimal control and 
other large m observable control problems. 

The paper is organized as follows. In Section II, we 
provide necessary preliminaries on Pareto optimal con- 
trol and quantum multiobservable control cost function- 
als. Section HI presents analytical results on the dis- 
tribution of Pareto optima in quantum multiobservable 
maximization problems. In Section IV, we describe how 
MOTC can be used to locate corresponding points on 
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the dynamic Pareto front and to subsequently explore 
families of fields within the front that minimize auxiliary, 
experimentally relevant costs. Section V proposes mea- 
surement strategies aimed at improving the experimen- 
tal efficiency of MOTC. In Section VI, we numerically 
implement Pareto optimal tracking control and illustrate 
the use of efficient measurement strategies for difhcult 
problems with large numbers of observables to. We also 
examine the impact of prior state preparation and the 
advantages of using MOTC versus scalar cost function 
optimization for Pareto front sampling. Finally, in Sec- 
tion VII, we discuss the implications of these results for 
Pareto optimal control experiments. 



II. PARETO OPTIMIZATION OF QUANTUM 
CONTROL COST FUNCTIONALS 



The class of quantum optimal control problems we 
examine in this paper in can be expressed [3| : 



max mm). 

e(t) 



(1) 



U{T) is an implicit functional of via the Schrodinger 
equation for the unitary propagator 



dU{t) 
dt 



-H{e{t))U(t), U{0) - In, 



where H is the total Hamiltonian, and e{t) is the time- 
dependent control field. Solutions to the optimal control 
problem correspond to = for all t G [0,T]. In 

quantum single observable control, $ is typically taken 
to be the expectation value of an observable of the sys- 
tem: 



$(C/(T)) = Tr{U{T)p{0)U\T)e), 



(2) 



where p(0) is the density matrix of the system at time 
t — and O is a Hermitian observable operator whose 
expectation value we seek to maximize 11| . Recent work 
has demonstrated that the landscape for optimization 
of ([2]) is devoid of local extrema. 

Although the goal of a multiobjective optimization 
problem may be to maximize the expectation values of 
all observables, i.e., 



max $(C/(T)) 

£(*) 



{$i(C/(T),... ,<f„(C/(T))}, 



in many cases the cannot be simultaneously max- 
imized. Thus, the scalar concept of optimality must 
be replaced by the notion of Pareto optimality A 
control field e*(t) is said to be strongly Pareto optimal 
if all other fields £{t) have a lower value for at least 
one of the objective functions $fc(-)j or else have the 
same value for all objectives. e*(i) is weakly Pareto 
optimal if there does not exist another field e(t) such 
that <i>fe(£(i)) > <I>fe(e*(t)) for all k. Analogous defini- 
tions hold for strongly and weakly Pareto optimal uni- 
tary propagators U*(T), which we refer to as kinematic 



Pareto optima in order to distinguish them from the 
aforementioned dynamic Pareto optima. Fig. 1 pro- 
vides examples of strong and weak kinematic Pareto op- 
tima. We denote the set of strong (kinematic, dynamic) 
Pareto optima by 'p^^''^\ and the set of weak Pareto op- 
tima by 'Pin''''\ We will use the term Pareto frontier (or 
Pareto front) to refer to 7?]'^''^-' or vH/'^^ Our primary 
interest in this paper is the development of algorithms 
for sampling weak Pareto optima, which are much more 
numerous in quantum control problems than are strong 
Pareto optima. Note that it is possible to define the no- 
tion of Pareto optimality for more general cost functions; 
in particular, selected observable expectation values may 
be minimized through the replacements <I>i:(-) — $fc(-). 

A natural scalar objective function $ for multiobserv- 
able control is a positively weighted convex sum of the 
individual observable objectives, i.e.. 



(3) 



fe=i 



where $fe = Tt{U {T)p{Q)U\T)Q^), A: = 1, 2, • ■ • , to. 
Within the electric dipole formulation, where the Hamil- 
tonian assumes the form 



H{t) = Ho-p- e(t) 



(4) 



with internal Hamiltonian Hn and electric dipole opera- 
tor fi, the gradient of $m is 



5$ 



M 



5e{t) 



■^afcTr{[efe(T),Mi)]p(0)}, (5) 



where ^.{t) = W (t) pU (t) . One approach to quantum 
Pareto optimal control is to employ gradient flow algo- 
rithms of the form 



ds 



7 



6^ 



M 



Ses{t)'' 



(6) 



where 7 is an arbitrary positive constant and s is a con- 
tinuous variable parameterizing the algorithmic search 
trajectory, to maximize an objective function or the form 
([3]). In order to sample the Pareto front, various set- 
tings of the convex weights in ^ are typically sam- 
pled in independent optimizations p^ . Note that the 
objective function <I>m is just the expectation value of 
a single quantum observable 9m = X^feLi "^fc^fc- The 
absence of local traps 0] in control landscapes for the 
single observable objective function ^ thus indicates 
that the multiobservable gradient flow will also reach its 
global optimum ^. However, it is not immediately clear 



^ It is possible to construct multiobservable objective functions 
that have a more general nonlinear form, such as those that 
have been used in optimal dynamical discrimination (^. These 
objective functions may possess diverse landscape topologies. 
Nonetheless, the most commonly employed multiobjective func- 
tions are devoid of traps. 
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whether the multiobservable gradient flow ^ converges 
to points on the Pareto front, since the global optima 
of ^ ioT Q = 8fc and ^ do not necessarily intersect. 
The same question arises for stochastic optimization of 
([3]). We now examine the conditions under which ^ 
will converge to a weak Pareto optimum. 

III. IDENTIFICATION OF PARETO OPTIMA 

In order to understand the structure of the set V^^ 
of weakly Pareto optimal points, it is necessary to first 
characterize the global optima of the individual objec- 
tives. The condition for a unitary propagator [/ to be a 
critical point of objective function ([2]) is p^ : 

[ek,up{o)u^ = o. (7) 

Let p(0) and Qk be diagonalized according to p = 
R^p{0)R, Qk = SlQkSk, such that 

p = diag{ A(i), A(i); A(r), A(r)} (8) 

Gfc = rfja.g{7fc(i), • • • ,7fc(i); • • • ;7fc(sfc)' ■ ■ ■ ''^fe(sfc)K9) 

where the A(i), Jk{i) a-re distinct eigenvalues of p(0), Qk 
with multiplicities ni, • ■ • , rir and ruki, • ■ ■ , ti^s^ , re- 
spectively. Further, set IJ = R^USk- We choose R such 
that A(i) < • • • < A(j.) and Sk such that 7fe(i) < • • • < 
Ik(sk)- A diagonal matrix of eigenvalues satisfying this 
condition is said to be "arranged in increasing order" . 

It was shown in [l3| that the critical submanifolds A^^ 
that satisfy equation ([7]) for each objective $fe may be 
expressed in terms of U as the double cosets 

Ml^UinXUixUk), (10) 

where U{mk) is the product group U{mki) x ••• x 
U{mksk )i each lA{mki) corresponding to an eigenvalue of 
Qk with TOfcj-fold degeneracy (respectively U{n),U{ni)x 
■ ■ ■ xU{nr), p(0), ni), and tt* denotes a permutation ma- 
trix on N indices. We denote by diag{j'^^, ■ ■ ■ , 7^jv} = 
n'^QkiT the diagonal matrix of (possibly degenerate) 
eigenvalues resulting from application of a permutation 

TT to Qk- 

If two permutation matrices tt^, tt^ satisfy the relation 

7r2 = [/7rV, {U,V) eU{n) xU{mk), (11) 

then they are associated with the same critical submani- 
fold, and we write tt^ ^ tt^. By this equivalence relation, 
we can partition the permutation group 'D{N) into non- 
intersecting subsets: 

p(A^) = p^u•■•upf^ (12) 

where each subset is an equivalence class with respect to 
~ and uniquely corresponds to a critical submanifold; dk 
is the number of critical submanifolds. For simplicity, 
we always let 2?™^'^ — D], correspond to the optimum 



manifold A^™'*'^, whose elements arrange the eigenvalues 
of in increasing order. 

Analytical results concerning the distribution of 
Pareto optima are most readily derived in the case where 
the observables {6fc} commute. In this case, they can be 
simultaneously diagonalized by a single unitary trans- 
formation S. Because the partition (fT^ for each k is 
complete, each V], must overlap with at least one 2?^,. 
Hence, the critical manifolds IJ- M\ and IJ^- M.\, of 
$fc and respectively, must intersect. The critical 
sw&manifolds M]. and TW^,, intersect if the correspond- 
ing VI n V{, ^ 0. 

Based on the above result, we can specify the relation- 
ship between the maximum of the function <I>m(C^) and 
the critical submanifolds of the single observable cost 
functions <I'fc([/) in the case that the {Qk] commute. 
Let Qm = El'Li "fcOfc, let V{N) =Vl,U---U V^f be 
the corresponding partition of permutation group, and 
let S denote the unitary transformation that arranges 
Q]\i in increasing order. Then we have the following 
lemma pertaining to the conditions for convergence of 
gradient flow ([6]) to a Pareto optimum. 

Lemma 1 Let {Qk} be a set of mutually commuting 
observables. The gradient flow (0j may (depending on 
initial conditions) converge arbitrarily close to a weak 
Pareto optimum if for at least one k, V™^^ fl ^ 0. 

The flow is guaranteed to converge to a weak Pareto opti- 
mum if for each Uimj^n) we have S'lS U{niMi) Sk C 
U{mkj) for some j, and 2?"?'' ^ P™""". The flow may 
converge arbitrarily close to a strong Pareto optimum if 

pmax p pmax ^ ^ p^^ g^^^ fl^^ 

converge arbitrarily close to the critical submanifold Ail. 

o/$fc ifvinv^-^^H). 

The proof for guaranteed convergence to V^'^^ is pre- 
sented in Appendix A. The remaining claims follow di- 
rectly from the arguments above. 

Now consider the problem of solving for the coeffl- 
cients {ak{ such that the gradient flow is capable of con- 
verging to an arbitrarily specifled combination of Z < to 
critical submanifolds of the respective observable oper- 
ators, i.e., 

^k{U)^x'k'. x'k'^Ck, l<k<l, (13) 
where x]^ denotes the ifc-th critical value in the set Ck of 
critical values of $fc(C/). Let us write Qm = S'^QmS = 
d-iag{Y.k'^klli, - ■ ■ :SfcQ!fc7fcAr}: where Hk is the per- 
mutation matrix that reorders the diagonal elements of 
Qk ~ SlQkSk so that they are arranged in the order 
induced by S, i.e., S — TT^SkT^k- Provided that we have 

vY n---r] P;" ^ 0, (14) 

then for each permutation matrix vr G T>Y H • • • H P;' , 
there is an independent system of A'^! — 1 inequalities: 

TV m 

E E {^kj - Ik'j) > 0, V^' / TT. (15) 

j=l k=l 
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It is possible to satisfy conditions ([T^ at the global op- 
timum of the objective function ([3]) if a solution to this 
set of inequalities exists for at least one such tt. We 
therefore have the following theorem. 

Theorem 1 Consider a quantum Pareto optimization 
problem of the form \13]) involving mutually commuting 
observables {Ok}- 

1. It is possible to design an objective function (0j 
whose gradient flow may (depending on initial 
conditions) converge arbitrarily close to a weak 
Pareto optimum corresponding to Xk ~ XT'^^ 
while arranging the expectation values of observ- 
ables Qk' , k' ^ k in the order designated by equa- 
tion ilSp . if a solution to the inequalities I115\) ex- 
ists for at least one vr £ TT^ fl • ■ • H T>\^ . 

2. Similarly, it is possible to minimize the expectation 
value of observable Qk if a solution exists for Xk = 

^min 

3. If a solution exists for the choice {Xk'} — {x'k'^^}' 
then it is possible to design an objective function 
(0j whose gradient flow converges arbitrarily close 
to a strong Pareto optimum. 

The results of Theorem 1 pertaining to convergence to 
Pareto optima hold also for stochastic search algorithms 
that optimize a conventional multiobjective cost func- 
tion of the form 

The relative likelihoods of the gradient flow ^ con- 
verging to various Pareto optima can be qualitatively 
assessed by comparing the dimension of M^^f"^ to the 
dimension of the intersection J]^^^'"''''' = TW^^f'^nA^^^'^ 

for each observable k whose Xk = X'k'^^ equation 
These dimensions can be bounded analytically, accord- 
ing to a method described in Appendix A. 

In the case that the {6^} do not commute, the critical 
submanifold Ad], of observable Qk does not necessarily 
intersect submanifold A4l., of an observable Qk/ that is 
diagonal in a different basis even if VI n V-j,, 7^ 0. In this 
case, a simple analytical criterion analogous to Lemma 1 
for assessing whether ^ converges to a Pareto optimum 
does not exist, but a method presented in Appendix A 
may be used to determine the maximum possible dimen- 
sion of the intersection sets. 

The most common technique for sampling the Pareto 
front of multiobjective control problems is to run many 
independent maximizations of a cost functional like ([3]) 
on the domain e{t), using different coefficients {ak}- 
The results of Lemma 1 and Theorem 1 indicate con- 
ditions under which this strategy may be successful in 
quantum Pareto optimal control, and demonstrate that 
it may not always constitute a viable method. Even 
when the global optimum of a multiobjective cost func- 
tion intersects the Pareto front, it is not necessarily ad- 
visable to employ such a function for Pareto front sam- 
phng on s{t), due to the considerable expense of each 
optimization on this high-dimensional domain 



An alternative, generic approach to accelerating 
Pareto front sampling is to employ stochastic search al- 
gorithms that do not rely on optimization of a scalar 
objective function. In recent years, several such multi- 
objective genetic or evolutionary algorithms (MOEAs) 
- including the elitist nondominated sorted genetic al- 
gorithm (NSGA-I I) p^ . the Pareto-archived evolution 
strategy (PAES) llal and strength-Pareto evolutionary 
algorithm (SPEA) [13 - have been designed for the prob- 
lem of multiobjective Pareto front sampling. These algo- 
rithms successively sort populations based on nondomi- 
nance. The NSGA-II algorithm was applied successfully 
to the two-objective problem of optimal quantum dy- 
namical discrimination While promising, MOEAs 
share several drawbacks, most notably a computational 
complexity that scales as 0{md'^) or O(md^), where d 
denotes the population size. For problems involving con- 
trol of a large number of observables, these algorithms - 
in addition to being less precise than deterministic algo- 
rithms - become very expensive since large population 
sizes are required to adequately sample the front. 

Due to the fact that the quantum control cost func- 
tional ([2]) are explicit functions of the final unitary 
propagator U{T), but only implicit functions of the con- 
trol field e{t), there exists a convenient alternative to 
either of the above Pareto optimization strategies based 
on a hybrid approach that combines kinematic and dy- 
namic sampling. Assuming full controllability (Tsj , there 
is at least one e{t) in 'Pfy that maps to any given U{T) 
in P^. The distribution of T'^ in the domain U{N) 
is determined solely by the properties of p{Q) and the 
{Qk}- Therefore, provided that the initial density ma- 
trix p(0) is known, it is possible to sample Pareto optima 
on U{N)] this sampling on a finite-dimensional space 
will be much more efficient than direct sampling of V^j . 
Once target points in V}^ have been numerically identi- 
fied, one may experimentally track a direct path to the 
corresponding points in P^, using techniques that will 
be described in Section HVl 

For example, many different coefficient sets {otk} may 
be rapidly sampled using the kinematic gradient flow [J| 
of the multiobservable objective function Q, namely 

ATT ™_ 

— = ^afc[efe,C/p(0)(7t]C/, (16) 

k=l 

instead of the dynamic flow ([5]). In the case that this 
flow converges to a (weak) Pareto optimum, it is a very 
convenient means of sampling segments of the front due 
to its speed. Of course, single observable kinematic gra- 
dient flows may always be used to locate a weakly Pareto 
optimal point, but do not provide the freedom to prefer- 
entially weight the importance of the other observables. 
Dividing the m observables into mutually commuting 
sets and applying the weighted flow (fTO)) for each set, 
with the {oLk} determined by solving the system of in- 
equalities pS]) . provides a simple and rapid means of 
sampling the (weak) Pareto optima associated with a 
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specified combination (|13p of multiple expectation value 
constraints. 

In the more general case where the {Ofc} cannot be 
conveniently subdivided into mutually commuting sub- 
sets, the analytical results above on the distribution 
of Pareto optima are not immediately useful, but it is 
straightforward to determine numerically whether cer- 
tain types of weak and strong Pareto optima exist. 
Given an estimate for p{0), and the expectation val- 
ues of m observables at time T specified as the control 
targets, the submanifold of unitary propagators U{T) 
within which the actual propagator may lie is defined 
by the system of equations 

Tr((7(r)p(0)[/t(r)efe) = Xfc, fc = i,...,m, (17) 
[uHT)U{T)]^^ ^ <5,„ z,j = l,...,7V,(18) 

where Xk is the desired expectation value of 6fe. If p(0) 
is full-rank and nondegenerate, the rank of this system 
of equations in the 2iV^ real variables Re (C/y ) , Im (Uij) 
is maximal, resulting in the smallest possible dimension 
of the solution submanifold in U{N). Note that this 
system of equations is overdetermined if 

■m>2Nn~n^, 71 = rank p(0), (19) 

provided {O^} is linearly independent. For example, if 
p(0) is a pure state, parameterized by N complex co- 
efficients subject to a normalization constraint, then no 
more than 2 — 1 independent observables can be simul- 
taneously driven to arbitrary expectation values. Multi- 
observable controllability amounts to controllability over 
a subset of the independent parameters of p(0). 

It is convenient to apply the principle of entropy max- 
imization (PEM) in choosing a target W from the solu- 
tion submanifold, since this identifies a propagator that 
is most likely, from the perspective of statistical uncer- 
tainty, to be the real propagator of the system given the 
measurements made during multiobservable control. In 
this approach 19], the von Neumann entropy 

S{p) = Tr(plogp), p{U{T)) = U(T)p{Q)U\T), (20) 

is maximized on U{N) subject to the constraints pT| . 

The following protocol then constitutes a general 
strategy for kinematic sampling of Pareto optima: 1) de- 
termine the maximum and minimum expectation values 
of each of the individual observables (by simple matching 
of the eigenvalues of p{Q), Ofc); 2) choose putative sets of 
observable expectation values {xfc} within these respec- 
tive ranges (setting I < m, I = m oi these expectation 
values to x™^^ foi' weak Pareto optima, strong Pareto 
optima respectively); 3) maximize the von Neumann 
entropy (using, for example, the simplex or Newton- 
Raphson algorithms with the constraints imposed as La- 
grange multipliers), assuming the system of equations 
p7|) above is consistent. 

In the next section, we describe experimentally im- 
plementable algorithms for the optimization of multiple 



observable expectation values on the domain of controls 
£(t), which can be used in conjunction with the above 
kinematic algorithms to efficiently sample the dynamic 
Pareto front. 



IV. PARETO OPTIMAL TRACKING 
CONTROL 

Once the target observable expectation values Xki k = 
1, • • ■ , TO on the Pareto front have been determined, it is 
possible to identify corresponding Pareto optimal con- 
trol fields e*{t) through minimization of an objective 
function of the following form: 

m 

*A/(C^)=E"'»l*'»('^)-^fcl'' (21) 

k=l 

over e(t), instead of maximization of ([3]). However, as 
described in [1], gradient flow algorithms of this type 
can be inefficient, especially when the rank of the ini- 
tial density matrix p(0) is high, as it may be in large 
molecular systems or at elevated temperatures. More- 
over, it is inconvenient to employ optimization of (j2ip 
to systematically investigate families of fields e*{t) that 
differ in other experimentally relevant dynamic proper- 
ties. In this section, we review and extend the more 
versatile methodology of multiobservable tracking con- 
trol (MOTC), which drives the expectation values of m 
observable operators along predetermined paths to the 
target {xfe} H. Once an e*{t) has been found, MOTC 
can be used to continuously trace the dynamic front to 
identify solutions that minimize the expenditure of con- 
trol resources. 

We assume, without loss of generality, that the ob- 
servables Oi, Om measured at each step of the search 
are linearly independent. Let 'I's,--- , <I'™ denote the 
expectation value paths (functions of an algorithmic pa- 
rameter s) for each observable, which together constitute 
a desired path Wg of $ in the vector space of multiob- 
servable expectation values. Then, the change in the 
control field along the track must satisfy the fol- 

lowing condition: 

ds Jo 5es{t) ds ds ■ ^ ' 

where $i = Tr(C/(r)p(O)f/^(r)0,) This is a Fredholm 
integral eqn of the first kind for the unknown partial 
derivative given es{t) at s and all t G [0, T]. To 

solve for the algorithmic fiow ^^fj^ that tracks Wg, it is 
necessary to expand it on a basis of independent func- 
tions. It is convenient to make this expansion on the 
basis of the independent observable expectation value 
gradients: 

j=i 
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Fig. 1: Quantum Pareto optimal control aims to maximize the expectation values (Qk), k = 1, - ■ ■ ,m (m = 3 in the present 
case). For each observable, the expectation value is maximized on a continuous submanifold A^^™''''' of unitary propagators 
U{T), the dimension of which is determined by the rank and eigenvalue degeneracies of the initial density matrix p(0) and Qk- 
The observables can be simultaneously maximized if the intersection M^^^^^^ is nonempty and a point in the intersection 
can be reached under some time-dependent field e{t) (left panel). This intersection is a subset of the set of strongly Pareto 
optimal propagators , also called the Pareto frontier. Depending on p(0) and {6fe}, some problems may not have a 
completely optimal solution, and may involve a tradeoff between maximization of different observables. The set of weakly 
Pareto optimal propagators consists of the union of intersections of the various TVf^™^'''' with level sets {U{T) \ {Qk{T)) = Xk} 
of other observables (right panel). Optimization of multiobjective scalar cost functionals on the domain e{-) (for example 
through gradient flows) does not generally lead to Pareto optima. Pareto optimal tracking control involves computational 
identification of points in or and subsequent experimental tracking of a direct path to that manifold on e(-). 



where fs{t) contains the degrees of freedom outside the 
subspace spanned by ^j^y- As described in [1], fs{t) 
can take on a variety of forms depending on auxiliary 
dynamical costs on e(t) that are to be minimized. Insert- 
ing the expansion ([23]) into the integral equation ((22)) . 
we have 



&s(i) fcs(i) 



6<i>i dwt 

dt = 



ds 



Ses{t) 



fs{t)dt. 



Defining the m-dimensional vector SLs(t) 
{alit),- ■■ ,aT{t)) hy 



= aTTTY = -TTr(p(0)[[/t(r)e.C/,(T),Ai,(i)]) , 



Ses{t) 

and the MOTC Gramian matrix Tg by 

(r,),, = r alit')alit')dt', 



(24) 



it can then be shown [J] that the initial value problem 
for es{t) given eo{t) is: 



ds 



fs{t) 



dw 

d7 



sL,it')Mt')dt' r;ia,(t). 

(25) 

This expression can be integrated either numerically (us- 
ing standard methods described in [i]) or experimentally 



(Section V) to track the desired path to the dynamic 
Pareto front. The invertibility of the Gramian matrix 
Fs corresponds to the ability to move from the point 
$5 in multiobservable space to any infinitesimally close 
point $s+ds through an infinitesimal change 6es{t) in 
the control field ,41] . With minor modifications the 
MOTC tracking differential equation ([25]) can also be 
used to carry out multiobjective optimizations for non- 
observable objectives. 

The MOTC algorithm can be applied to follow any ar- 
bitrary vector track Wg of observable expectation values. 
It is convenient to choose Wg based on the correspond- 
ing set of possible paths Us{T) in the domain U{N) of 
unitary propagators, since the latter contains the most 
information about the dynamics of the system at time T. 
Let Cs = {Vs 1 <I>/c(T4) = k = l,...,m} denote the 
set of propagators that map to the observable track at al- 
gorithmic time s. Denote by the system of equations 
(I17p with {xfc} replaced by w^. Assuming this system 
is consistent, the dimension of Cs is equal to iV^ less 
the number of independent equations in Bs- As such, 
the path Us{T) followed by the optimization algorithm 
in U{N) will, on average, be more similar to a target 
unitary track Qs, for a greater number m of orthogonal 
observables (note dim Cs also depends on the eigenvalue 
spectra of p(0) and {0^}). As a function of the algorith- 



mic step s, the term 



^s{t')fs{t')dt' 



m equa- 
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tion ([25)1 will adjust the step direction so that the possi- 
ble unitary propagators Vs at each step are constrained 
within the subspace Cs (provided T is invertible; see Sec- 
tion VI). In Section |Vl we describe special choices for 
that are globally optimal from the standpoint of mean 
distance traveled in U{N). 

An alternative MOTC-based approach to identifying 
weak dynamic Pareto optima is to maximize the expec- 
tation values of successive observable operators while 
holding those previously maximized at constant values. 
This requires successively applying m MOTC searches 
of dimensions 1, • • • , to, respectively; these searches may 
be carried out in an order that reflects the relative im- 
portance of the observable objectives. Let (pi, ■ ■ ■ ,Pm), 
where p is a permutation on m indices, denote this order 
and let x^'' denote the conditional maximum of <I>^* (e(i)) 
given <^P^[e{t)) = ^ < j < i- This approach then 
amounts to successive integrations of (j25|) with 



' S 



.X' 



Pr-1 



,m, 



(26) 
(27) 



where is any monotonically increasing function of s. 
For maximization of m observables, there are m\ such 
combinatorially distinct control strategies. 

The final observable expectation values {xk} will sel- 
dom be the only properties of a control solution that are 
relevant experimentally. In general, other dynamic prop- 
erties of the field, such as its total fluence e'^{t)dt, will 
have implications for the feasibility of a control experi- 
ment. Hence, the ability to continuously traverse a sub- 
manifold of the dynamic Pareto front, once it has been 
found, may be essential for identifying a Pareto optimal 
control e*(t) with the most desirable physical properties. 
There generally exists an infinite number of control field 
solutions corresponding to any given combination of fi- 
nal observable expectation values {xfe}, as evidenced by 
the presence of the free function fs{t) in equation ((25)) . 
The MOTC algorithm offers a systematic means of ex- 
ploring these families of solutions through specification 
of fs{t)- To maintain the objective function values {^k} 
at their target values {xk} while sampling fields corre- 
sponding to a free function /s(i), the following initial 
value problem for Eg (t) is solved: 



ds 



fs{t) 



a,(t')/.(t')di' 



r;ia,(t). (28) 



With the observable expectation values fixed, different 
choices of the free function fs{t) correspond to differ- 
ent trajectories through the associated portion of the 
dynamic Pareto front. The final unitary propagator 
U(T) may change during such excursions. Choosing 

fs{t) = — , where S{t) > is an arbitrary weight 

function (e.g., a Gaussian) and 77 is a constant that 
controls numerical instabilitieSjWill minimize fluence at 
each algorithmic time step s [23|. Note that equation 



can also be used to identify fluence- minimizing con- 
trols for other multiobservable control problems like op- 
timal state preparation. 

The connectedness of the submanifold of corre- 
sponding to the expectation values {xfc} determines 
which e*{t) can be accessed by application of (l28)) . 
It has been shown [2l| that the kinematic level sets 
{f/(r) I $fc(C/(T)) = Xk} of single observables are 
connected manifolds. However, the dynamic level sets 
{sit) j $fc(e(t)) = Xk} may under certain circumstances 
|22 | be disconnected. Moreover, it is possible that the 
intersections of the kinematic level sets of multiple ob- 
servables (i.e., the solutions to the system of equations 
(fT7)) ) consist of disconnected submanifolds. If this in- 
tersection is disconnected, it follows that the dynamical 
intersection set is also disconnected (22j . 

A disadvantage of the purely dynamic Pareto front 
sampling strategy (|26p is that since the manifold of con- 
trol flelds {e{t) I $fc(e(t)) = Xfc, 1 < fc < m} to which 
the algorithm converges may in general consist of discon- 
nected submanifolds, certain classes of solutions may not 
be accessible from a canonical starting field e* {t) by the 
homotopy tracking algorithm (|^5)) . By contrast, sam- 
pling of the kinematic Pareto front by the methods de- 
scribed in Section IIIII can (at the expense of additional 
initial overhead required for state estimation) identify 
propagators U* (T) that lie in different disconnected sub- 
manifolds of the front, such that (|28| can subsequently 
be applied to reach any control field in the local vicinity 
of the canonical field e* (t) obtained from application of 
(|25)l . On the other hand, a virtue of the method (|26)) 
is that it is also applicable to quantum Pareto optimal 
control problems other than observable maximization, 
where kinematic identification of Pareto optima may not 
be possible. 

Maximization of (|2T)) or of ([3]) (assuming {ak} are 
restricted to the ranges stipulated by the system of in- 
equahties (fT^ ). with various coefficient sets {ak}, will 
also lead to distinct points on the dynamic Pareto front 
with the same observable expectation values but differ- 
ent U{T) and/or dynamic properties of the field. How- 
ever, such a strategy is far less efficient than applying 
equation (|^5)) , since a separate optimization must be car- 
ried out for each set of coefficients. Further mathemat- 
ical details on the formal correspondence between these 
two approaches, from the perspective of differential ge- 
ometry, may be found in the treatise Q. 

The observables {9^} of interest are not necessar- 
ily orthogonal. However, the statistical efficiency of 
MOTC (i.e., the tracking accuracy for a given number 
of measurements) is easiest to assess when orthogonal 
observables are measured at each step. For simplicity, 
we represent each of the Hermitian matrices as an A^^- 
dimensional vector with real coefficients. Then Gram- 
Schmidt orthogonalization offers a convenient means of 
constructing an orthogonal basis of m linearly inde- 
pendent A^^-dimensional vectors, &[{T), 0^(T) that 
spans the associated subspace - any element of the set 
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{8fe} can be expressed as a linear combination of the ba- 
sis operators in this set, i.e., for any k, Qk = X^I^i Cfci0i- 
When the desired tracks of the target observable expec- 
tation values are thus expanded on a set of experimen- 
tally measured orthogonal observables, we have 

for the gradient of each of the observable expectation 
values (Ofe) with respect to the control. 

V. EXPERIMENTAL IMPLEMENTATION 

This section discusses methodologies for the labora- 
tory implementation of MOTC and Pareto optimal con- 
trol. Laboratory implementation of quantum optimal 
control is typically carried out in open loop with numer- 
ical search algorithms guiding the experimental control 
field updates (e^, via a laser pulse shaper) at each al- 
gorithmic step [10]. 

A. Initial state estimation 

In order to kinematically sample the Pareto front as 
described above, it is necessary to have an estimate for 
the initial density matrix p(0). In the case that p(0) 
is not a thermal mixed state, it can be determined by 
the method of maximal likelihood estimation (MLE) of 
quantum states ^Ij. Here, the likelihood function 

n 
1=1 

where Ti denotes a positive operator valued measure 
(i.e., one of a complete set of — 1 orthonormal ob- 
servable operators) corresponding to the outcome of the 
i-th measurement, describes the probability of obtaining 
the set of n observed outcomes for a given trial density 



matrix p. This likelihood function must be maximized 
over the set of admissible density matrices. An effec- 
tive parameterization of p is the Cliolesky decomposition 
/5(0) = T^T - where T is a complex lower triangular 
matrix with real elements along the diagonal - which 
guarantees positivity and Hermiticity. The remaining 
condition of unit trace is imposed via a Lagrange multi- 
plier A. Maximization of the fimction 

n 

C{f) = ^lnTr(f^TJ^,) - ATr(T'^f), 

with the choice X = n for the Lagrange multiplier, guar- 
antees normalization of p at the MLE estimate [l| . Stan- 
dard numerical techniques, such as Newton-Raphson or 
uphill simplex algorithms, are used to search for the 
maximum of C over the iV^ parameters of the matrix 
f . 

B. Minimal- length Ws and error correction 

There are many possible choices for the track that 
leads to a point W in P^. However, it is desirable to 
choose paths that are globally optimal from a geometric 
perspective. The Riemannian Bures metric on the space 
of density matrices [23| could be used for this purpose, 
but this metric cannot be expressed in compact form for 
arbitrary Hilbert space dimensions, especially for arbi- 
trary TO-dimensional subspaces of the Hilbert space. We 
thus adopt the more convenient approach of defining the 
target multiobservable track as the Ws that maps to the 
geodesic Qs in U{N) between any Uq consistent with the 
initial control guess and W, i.e., 

w^ = ^cfc,;Tr{p(o)gte^g,}, (30) 

i 

where Qs — Uoexp{iAs) with A = — z log(ly''C/o)• 
Error-correction methods can be applied to account 
for deviations from the track Wg. These generally in- 
volve the addition of a correction term Cg to the tracking 
differential equation ([^5]) such that 



desjt) 

ds 



fs{t) 



dwg 
ds 



as{t')fs{t')dt' 



r7'a,(i), 



(31) 



where may be taken to be a scalar multiple of the 
difference between the actual vector of observable ex- 
pectation values and the target track, i.e., 

cs=(3{ws-$s), f3>0. 

Note that since this error-correction method requires 
only estimation of the expectation values of the m ob- 



servable operators it is straightforward to imple- 

ment in an experimental setting. 

Runge-Kutta (RK) integration of the MOTC differ- 
ential equation pip can further decrease tracking errors 
by employing derivative information at multiple points 
across the algorithmic step, instead of only at a single 
point. In principle, RK may be applied experimentally 
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at a substantially lower cost than including second order 
functional derivatives in the tracking differential equa- 
tions, since the latter requires estimation of the Hessian 
matrix of the observable expectation values with respect 
to the control. Moreover, although it is possible to im- 
plement second-order MOTC numerically, accurate ex- 
perimental estimation of the Hessian is difficult due to 
the inevitable presence of experimental noise. Thus, RK 
integration is the preferred method for further improve- 
ment of the experimental accuracy of MOTC algorithms. 



C. Choice of measurement bases 

Gram-Schmidt orthogonalization, as discussed in the 
previous section, is a convenient method for obtaining 
a canonical orthogonal observable basis for MOTC, but 
from the point of view of experimental overhead, it is 
generally not statistically efficient to measure these ob- 
servables. The complete set of iV^ — 1 possible orthogonal 
observables can instead be expanded on iV -t- 1 orthonor- 
mal measurement bases. Here, we use mutually unbi- 
ased measurement bases (MUB) [2^, which are known 
to provide tight confidence intervals on the expectation 
values of a set of multiple observables for a finite number 
of measurements (the advantage in statistical efficiency 
increases with m) . The expression for MUB differs based 
on the Hilbert space dimension N. When N is the power 
of an odd prime (which encompasses all cases considered 
in the current work), these bases V^"^"^ are given by 



diag{<7i,--- ,aN}. The multinomial distribution param- 
eters pi, ...,pn-i are then formally given by 



yir) 



r = 



1 exp[2^(rg2+pq)], l<r<N. 



The orthonormal observables 8' in equation (|29p can 
then be taken to be 

Q'riN-l)+^ = ^('^)e:(yW)t, (32) 

6^ = =dja5{0,...,l,...0}, (33) 

for l<i<N— l,Q<r<N. However, the orthonor- 
mal observables in these bases need not be projection 
operators. 

Every quantum measurement in a basis V'^^^ pro- 
vides information regarding the parameters pi, ...,pn of 
the multinomial distribution corresponding to that ba- 
sis {N — 1 of these may be considered parameters of 
p(T)). The measurement of an observable O with s dis- 
tinct eigenvalues returns s — 1 independent parameters 
from the set. Control of multiple observables can be 
achieved for the lowest experimental cost by measur- 
ing nondegenerate observables in the bases of interest. 
Moreover, tight bounds on those gradient estimates will 
be obtained if the measurement bases are mutually un- 
biased. Therefore, for each MUB basis, assume a full- 
rank, nondegenerate observable O is measured that is 
diagonal in that basis, i.e., 9 ^ y('-)E(y('-))t, S = 



= Tr (p(T)yWeai^( 



0^t 



1< i < - 1, 



with the remaining p^ = 1 — 'l2iLi^ Pi- These corre- 
spond experimentally to the frequencies with which the 
measurement outcomes ct^ are observed. Assuming for 
simplicity that the measurement basis V^^'^ has been cho- 
sen such that it coincides with a basis within which one 
of the m target observables &k is diagonal (extension to 
the more general case is straightforward), and writing 
Qk = yWs(yW)t where 5 = diag{-ji,--- ,1n}, the 
expectation values of the m target observables at each 
MOTC step can then be written in terms of the experi- 
mentally estimated frequencies as 



N 

ckiPiii, 

i=l 



where the Cki are the expansion coefficients in ([29|) . This 
approach is generally more efficient than direct estima- 
tion of the (Ofe). 



D. Gradient estimation 

In an experimental setting, the gradients 37^^ niay 
be estimated by statistical sampling of points near the 
current control field £s [t) ^25] instead of the method of 
finite differences; the latter is inaccurate due to the in- 
evitable presence of noise (e.g., laser noise) in the lab- 
oratory. Let us denote a discretized representation of 
the control field £(t) by the vector x (e.g., the 128 pixel 
settings in a laser pulse shaper) with associated domain 
2?, the point at which the gradient is evaluated by xq, 
and a symmetric distribution function around over 2? 
by vr. 

It can be shown [23 that the following expression con- 
stitutes a second-order approximation to V$(a5'o): 

V<l>(fo) ~ j {x - xq)<^(x)tt{x - xo)df, 
Jv 

where the covariance matrix is 

E= /(f-fo)(x-xo)M-xo)df. 
Jv 

This expression can be evaluated via Monte Carlo inte- 
gration over iVexp measurement results: 



V$(fo) 



;(#') -fo)*(^^'^)^(2^^'^ -^o). 



Open-loop steepest ascent control field search using a 
femtosecond laser pulse shaper, with the gradient eval- 
uated at each iteration via this technique, successfully 
reached the global maximum of the control landscape for 
second harmonic generation [25[ , indicating a robustness 
of such gradient-based search algorithms to noise. 
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Fig. 2: MOTC-based Pareto optimum (P^) sampling for 
three observables in an 11-level system. Weakly Pareto 
optimal unitary matrices were determined using the kine- 
matic gradient (|16p followed by MOTC to identify optimal 
controls. The convex weights in equation (|3]) were set to 
Qi = 0.7, Q2 = 0.2, Q3 = 0.1 (dashed); qi = 0.2, 02 = 
0.7, Q3 = 0.1 (solid); qi = 0.1, 02 = 0.2, 03 = 0.7 (dotted). 
p{Q) was a thermal mixed state. A) Observable Oi; rank 
three diagonal degenerate observable with the three highest 
energy levels targeted; B) Observable O2 = |2)(2|; C) Ob- 
servable 63 = |3)(3|. 



VI. EXAMPLES 

As a representative Pareto optimal control scenario, 
we apply tracking control to the weighted maximization 
of three mutually commuting observables. In addition, 
in order to illustrate the characteristic features of prob- 
lems - such as optimal mixed state preparation - that 
require the control of m > 2N — 1 observables, we ap- 
ply MOTC using multiple mutually unbiased measure- 
ment bases to a multiobservable control problem involv- 
ing control of 40 observables in an 11-level system. 

The tracking examples below employ an 11- 
dimensional Hamiltonian of the form ([¥]), with 



Ha = 



diag {0.1,0.2,- •• ,1.1}, 

0.15, |z-j'| = l; 
0.08, |z-j|=2; 
0, otherwise. 



(34) 
(35) 
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Fig. 3: Optimal control fields A), B) (magnified) and their 
Fourier power spectra C) obtained by the MOTC Pareto 
optimum sampling depicted in Fig. [2] for weights ai = 
0.7, Q2 = 0.2, Q3 = 0.1 (dashed) and qi = 0.2, Q2 = 
0.7, 03 = 0.1 (solid). eo(i) was of the form p6|) with am- 
plitudes weighted according to the number of times the fre- 
quency uoij appears in the transition frequency spectrum. 



It can verified (by checking that the rank of the Lie al- 
gebra generated by and Hq , ji equals A^^), that this sys- 
tem is fully (propagator) controllable [ll, [111. MOTC 
and multiobservable gradient flow algorithms were im- 
plemented using the numerical methods described in 0] . 



A. Quantum control landscape Pareto front 
sampling 

A common objective in quantum multiobservable 
maximization problems is the identification of control 
fields that maximize a single observable expectation 
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value while selectively weighting the importance of in- 
creasing the expectation values of several auxiliary ob- 
servables. As discussed above, such control fields are 
weakly Pareto optimal. These weak Pareto optima can 
be efficiently sampled using the methods described in 
Sections III and IV. As an example of weighted multiob- 
servable maximization, several combinations of convex 
weights in multiobservable objective function ([3]) were 
employed in separate trials, representing desired trends 
in the relative magnitudes of the respective observable 
expectation values of three commuting observables; each 
observable was assigned the highest weight in several 
trials. The weakly Pareto optimal set corresponds in 
this case to points on e{t) where at least one of the 3 
observable expectation values is at its maximum pos- 
sible value; for the purposes of numerical illustration, 
~ 85% of the maximum achievable value was considered 
sufficient. The kinematic gradient (jl6p was applied to 
maximize the multiobservable objective function on the 
domain U{N)\ in the case of these observables, it always 
reached a weakly Pareto optimal point. The expecta- 
tion value of the observable assigned the highest weight 
reached the greatest value. 

MOTC (equation ([25]) ) was then applied to track mul- 
tiobservable expectation value paths ([30|) to controls in 
VI, that map to these points in (Fig. [2]). Fig. [3^ 
compares the optimal control fields obtained by MOTC 
and their Fourier power spectra at a point on the weak 
Pareto optimum where observable 1 dominates, with 
those at a corresponding point where observable 2 dom- 
inates. These points do not lie on the same level set 
of any observable. Nonetheless, note that their Fourier 
power spectra (Fig. 3B) are very similar, indicating that 
many similar fields reside at the distinct weakly Pareto 
optimal manifolds corresponding to different dominant 
observables. By contrast, optimal fields that lie on the 
level set of a single observable may display very differ- 
ent Fourier spectra, if different numbers m of observables 
are controlled (as shown in Fig. [6] below). This suggests 
that the major contributor to the complexity of optimal 
fields in Pareto optimal control problems is the number 
of parameters of the density matrix or U{T) that are si- 
multaneously constrained, and not the tradeoff between 
the expectation values of the various observables on the 
Pareto front. 

In order to assess the advantage of precomputing a 
propagator in and tracking to a corresponding point 
in VI, via MOTC, the multiobservable dynamical gradi- 
ent ([5]) was applied directly in an adaptive step size (line 
search) steepest ascent algorithm. Although the multi- 
observable gradient flow converged to a weak Pareto op- 
timum, the dynamical gradient flow was substantially 
less efficient in most weighted multiobservable maxi- 
mizations compared to single observable maximizations 
(data not shown) . Typically, none of the m observable 
expectation values increase monotonically in dynamical 
multiobservable steepest ascent, whereas the expecta- 
tion value of the most highly weighted observable does 



rise monotonically in the kinematic multiobservable gra- 
dient flow, presumably due to the lower dimensional- 
ity of the search domain. In an experimental setting 
where noise may obscure minute differences in observ- 
able expectation values, application of steepest ascent 
may therefore not be as efficient as it has been found 
to be for the control of single observables ^25*1. By con- 
trast, kinematic sampling on U{N) (after estimation of 

the state p(0)) is a reliable and rapid method of sampling 
-pu 

' w • 

In the above example, the observables were mutually 
commuting, i.e. belonged to the same measurement ba- 
sis. In this case, the kinematic gradient flow could 
be used to locate weakly Pareto dominant points, since 
the maximum submanifold Ai'^^^ of <l>j\/ intersected the 
maximum Ai™'^^ of one of the single observable cost 
functions (here, the cost functions satisfied the con- 
ditions in Lemma 1 guaranteeing Pareto convergence). 
In the more general case where the targets are m-tuples 
of arbitrary observable expectation values, or points on 
the Pareto front where specific expectation values of the 
m — 1 dominated observables are desired, the von Neu- 
mann entropy (|20p can be maximized, instead of em- 
ploying the kinematic gradient flow. 

B. Effect of state preparation 

The MOTC Gramian matrix F must be invertible at 
a given control fleld es{t) in order for the tracking al- 
gorithm to be able to follow any arbitrarily designated 
path to the Pareto front V^ . Typically, a Gramian con- 
dition number C > 10® results in large numerical errors 
upon inversion, and would be expected to compromise 
the accuracy of tracking, due to the sparseness of con- 
trol field increments de{t) that are capable of driving the 
system to the arbitrary neighboring states. 

As the number of controlled observables m increases, 
the condition number C can rise steeply beyond a criti- 
cal m, where additional observables cannot be assigned 
arbitrary local tracks. Analytically, when the condi- 
tion is satisfied, the m functions of time 
that dictate local multiobservable controllability can- 
not remain linearly independent, resulting in an ill- 
conditioned Gramian F. This feature of multiobservable 
local controllability is independent of the Hamiltonian. 
However, in practice, condition numbers typically do not 
explode at a critical m, due to numerical inaccuracies in 
the singular value decomposition of matrices that are 
close to singular. 

Fig. |4] compares the F condition number distributions 
for a system with pure p(0), for mi = 20 and m2 = 40, 
using randomly sampled fields e{t) of the form 

N N 

e{t) = X! X! ^"^^ ('^^J* + '/'ij) . < i < T, (36) 
where ujij = \Ei ~ Ej\ denote the transition frequencies 
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Fig. 4: Gramian matrix condition number distributions for an 11-level system (random internal Hamiltonian Ho and dipole 
operator jj,): multiple measurement bases and the effect of state preparation. The amplitudes and phases of the modes of 
the control field e{t) were sampled randomly from the uniform distributions (0,1] and (0,27r], respectively, with the mode 
frequencies tuned to the transition frequencies of the system. A) MOTC, 20 observables: pure state p(0). B) MOTC, 40 
observables: pure state p(0). Observables were drawn sequentially from successive mutually unbiased measurement bases in 
equation (|32}. 



between energy levels Ei, Ej of Hq, (pij denotes a phase 
sampled uniformly within the range (0, 27r], and Aij de- 
notes a mode amplitude sampled uniformly within the 
range (0,1]. The final time T was chosen to be suffi- 
ciently large to achieve full controllability over U{N) at 
t = T 0, 13 ■ The observed trends are consistent with 
theoretical predictions since mi = 2N — 2 < 2N — 1, 
whereas rn2 = 4iV - 4 > 2A^ - 1. 

An increase was observed in the mean condition num- 
bers encountered for large m MOTC algorithms starting 
from the field modes of e{t) tuned to the transition fre- 
quencies of the system, compared to e{t) modes tuned 
to sample frequencies within roughly the same range 
but not precisely tuned to the transition frequencies 
(data not shown; fields of the form (|36p with frequencies 
ujj = 0.1 j {En - Ei)/h, I < j < 50 were used). Thus 
careful tuning of the initial guess for the control may 
facilitate implementation of MOTC algorithms for large 
m. 



C. MOTC with multiple measurement bases 

Greater numbers of orthogonal observables can be ef- 
fectively tracked for state preparation p{0) of higher 
rank. A full set of observables from multiple measure- 
ment bases (or more generally, m < 2N'n—n^ orthogonal 
observables) can be effectively tracked for state prepa- 
ration of p{0) with rank n, according to equation 

Figs. IDA. and B indicate that for m = 40, the F con- 
dition numbers for thermal p(0) are significantly lower 



than those for pure p(0). It should be possible to track 
40 observables without significant numerical errors oc- 
curring due to singularity of the MOTC Gramian. 40 ob- 
servables can be tracked through measurements in four 
bases. For this purpose, we employed the r — 0,1,2,3 
MUB bases given by equation ([5^ . The F condition 
numbers and associated tracking errors are presented in 
Fig. [SJ'\.,C (solid trace). Despite uniformly higher con- 
dition numbers, singularity is not a limiting factor when 
p(0) is thermal; the error correction methods described 
in section fVl suffice to maintain the desired optimization 
trajectory. 

For p{0) being a pure state, the number of observables 
that can be simultaneously tracked along arbitrary paths 
while avoiding F-matrix singularities is lowest. Accord- 
ing to Fig. m the F-matrix for tracking 40 observables 
for p(0) being pure should be effectively singular. This 
leads to unacceptably large tracking errors and devia- 
tion from the expected path, as shown in Fig. [Sp. In- 
deed, based on the analytical relation ([TO)) , it should 
not be possible to track a full set of observables from 
more than two measurement bases when p(0) is pure. In 
these cases, search algorithms based on scalar objective 
optimization, such as the gradient, may be preferred. 
Nonetheless, as described in Section III, gradient search 
will generally not effectively locate points on the Pareto 
front for arbitrary sets of multiple observables. 

The evolution of control fields along the optimization 
trajectory for increasing numbers of observables sheds 
light on the impact of multiple observable objectives on 
the control landscape. Fig. [H] compares the control field 
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Fig. 5: Comparison of MOTC of a set of 40 observables from 
four measurement bases for a full-rank nondegenerate mixed 
initial state p(0) and for a pure initial state; the lowest eigen- 
state El, of Hq). A) F matrix condition numbers for 
thermal p(0); B) F matrix condition numbers for pure p(0); 
C) Associated tracking errors (solid — thermal, dashed = 
pure). 



Fig. 6: Evolution of the control field during MOTC with dif- 
ferent numbers of measurement bases (system (|34|l l. Top: 5 
observables (1 measurement basis); Bottom: 15 observables 
(2 measurement bases). eo(t) was of the form (|36|) : p(0) was 
a full-rank, nondegenerate mixed state. 



evolutions for m — 5 and m = 15 MOTC, with th 
track Ws given by ((30|) between Uq and a target prop 
agator lying on the global maximum manifold A^™'* 
for the first observable. At each step, the two fields n 
side on the same level set of the first observable, bu 
generally deviate increasingly from each other on th 
domain U{N). The imposition of additional observabl 
constraints delimits each successive level set such tha 
it contains a smaller set of more complex fields. Fi^ 
[7] isolates and compares the optimal fields obtained b 
m = 5 and m = 20 MOTC. The power spectra of th 
optimal control fields for m = 5, m = 15 and m = 20 ar 
shown in Fig. [H] Note that although these fields lie o 
the same level set of a single observable, they are mor 
diverse than those in Fig. [31 which lie on distinct sut 
manifolds of the Pareto front . Since control of mor 
orthogonal observables requires control over more inde- 
pendent parameters of the unitary propagator U{T), the 
latter problem imposes more constraints on the permis- 
sible dynamics over < t < T. 

Note that all the MOTC variants in this section (con- 
trol of up to 40 observables) can be implemented through 
measurements of four or fewer full-rank, nondegenerate 
observables in mutually unbiased measurement bases. 
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Fig. 7: Superimposition of the optimal control fields iden- 
tified by MOTC for 5 (dashed) and 20 (solid) observable 
control problems. £o(^) was of the form (|36p : p(0) was a full- 
rank, nondegenerate mixed state. Observables were drawn 
successively from mutually unbiased measurement bases as 
described in equation (|32p . 
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Fig. 8; Fourier power spectra of optimal control fields ob- 
tained by MOTC tracking for various numbers of observables 
m. The spectra for m — 5 (dashed) and m — 20 (solid), cor- 
responding to the fields in Fig. [3 are shown alongside the 
spectrum for m — 15 (dotted). 



VII. DISCUSSION 

To date, the majority of quantum optimal control ex- 
periments (OCE) have been implemented with adap- 
tive (e.g., genetic) search algorithms, which require only 
measurement of the expectation value of an observable 
at the final dynamical time for each successive control 
field. Recently, gradient flow algorithms have been im- 
plemented in OCE studies, based on the finding that 
quantum control landscapes are devoid of suboptimal 
traps [2^, and the observation in computer simulations 
that gradient-based algorithms converge more efficiently 
than genetic algorithms. Similar conclusions may be 
drawn about search algorithms for quantum multiob- 
servable control. Whereas in generic multiobjective op- 
timization problems, the existence of local traps typi- 
cally requires the use of stochastic methods, the inher- 
ent monotonicity of quantum control landscapes enables 
more efficient deterministic algorithms for multiobjec- 
tive quantum control. Quantum multiobservable track- 
ing control (MOTC) is an example of such an algorithm. 

Efficient control algorithms are especially important 
in the context of quantum Pareto optimization, since the 
tradeoff between observable objectives in such control 
problems renders the problem of identifying optimal con- 
trol fields on the Pareto front via stochastic algorithms 
much more expensive than the analogous problem of sin- 
gle observable control. Moreover, in quantum Pareto 
optimization, it is generally nontrivial to parameterize 



a family of scalar objective functions whose maxima are 
all (weak) Pareto optima, as demonstrated in section lllll 
Promising successes have been reported in the applica- 
tion of multiobjective evolutionary algorithms (MOEA), 
which avoid this obstacle, to quantum Pareto optimal 
control problems involving a limited number of quan- 
tum observables m 0. However, the convergence time 
of such algorithms scales unfavorably with system size 
due to the computational expense of sorting nondomi- 
nated solutions By contrast, apart from the over- 
head costs of initial state estimation and online measure- 
ments of observable expectation values, the convergence 
time of MOTC does not scale with the number of ob- 
servables optimized. 

In quantum multiobservable maximization problems, 
it is particularly efficient to first identify points on the 
kinematic Pareto front V^, of the multiobservable opti- 
mization problem, which requires knowledge or estima- 
tion of the initial density matrix p{0) of the quantum sys- 
tem. In the case that p(0) is not a thermal mixed state, 
it can be determined by the method of maximal likeli- 
hood estimation (MLE) of quantum states [l|. Once p(0) 
is known, the Pareto optima on U{N) can be sampled 
efficiently using the methods described in Section IIIII 
Observable tracking can then be applied in open loop 
quantum control experiments to locate optimal control 
fields e*{t) on the dynamic Pareto front V^. The local 
vicinity of each £*(t) can subsequently be explored to 
identify controls with the most favorable dynamic prop- 
erties. 

Experimental observable expectation value tracking 
requires an extension of gradient flow methods that have 
already been applied in the laboratory [2^. Specifi- 
cally, instead of following the path of steepest ascent, the 
control field is updated in a direction that in the first- 
order approximation would produce the next observ- 
able track value. In principle, the ability to follow the 
path of steepest ascent for single observable maximiza- 
tion should be extensible to following arbitrary trajec- 
tories across multiobservable control landscapes. When 
applied in conjunction with signal averaging, statistical 
estimation of the gradient - employing as few 30 sam- 
pled points as per iteration - has been found to confer 
adequate accuracy and robustness to laser noise in single 
observable gradient fiow algorithms. In practice, experi- 
mental multiobservable tracking may demand additional 
robustness to noise, due to the smaller number of con- 
trol fields that are consistent with any given point along 
the track. Experimental error correction, described in 
Section IV and applied numerically in Section VI, may 
therefore be essential for these applications. Our goal 
in this paper has been to lay the theoretical foundations 
for quantum Pareto optimal control and to illustrate its 
implementation under ideal conditions. Numerical stud- 
ies that simulate the effects of control field noise [13] on 
MOTC will be reported in a separate work. 

The eigenvalue spectrum of the initial density ma- 
trix p(0) affects the combinations of observable expecta- 
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tion values that can be simultaneously achieved and the 
nonsingularity of the multiobservable control Gramian. 
Tracking a vector $(s) of m orthogonal observable ex- 
pectation values rarely encounters singularities originat- 
ing in lack of local controllability, if m <IC N"^ — 1. 
However, for the greater number of observables that 
must be controlled in complex Pareto optimization or 
state preparation problems, the MOTC Gramian is more 
prone to becoming ill-conditioned. This effect can be 
alleviated by carefully tuning the initial guess for the 
control field. 

The additional overhead required for multiobservable 
tracking, relative to stochastic algorithms, can be miti- 
gated by the development of efhcient estimators of mul- 
tiobservable control gradients. We have presented sim- 
ple measurement techniques, including the use of mutu- 
ally unbiased measurement bases, that can dramatically 
improve MOTC efficiency for large numbers of observ- 
ables. Still, the methodology for gradient estimation 
described in section IIVI does not exploit the statistical 
correlation of components of the multiobservable gra- 
dient ([5]), which originates in the correlation between 
elements of /o(0). Future work should investigate the 
application of MLE to generate more accurate estimates 
of the multiobservable gradient for an equivalent number 
of observable measurements. In addition, it is important 
to examine the impact of quantum decoherence - which 
produces nonunitary dynamics described by Kraus op- 
erators - on the efficiency of multiobservable tracking. 



APPENDIX A: CHARACTERIZATION OF 
PARETO OPTIMAL SUBMANIFOLDS 

In order to qualitatively assess the likelihood of the 
gradient flow ^ converging to a weak Pareto optimum 
where the expectation value of observable k is maxi- 
mized, the dimensions of Z^^^'"'''' = TW^'^'^ f] -A^m 
and jM™/"^ may be compared. If the two dimensions are 
equal, there is a finite chance of the gradient flow con- 



verging to a Pareto optimum. Otherwise, although the 
gradient flow may converge arbitrarily close to a Pareto 
optimum, the attractor of the flow will not be in vH/'^^ 
and the likelihood of approaching within a fixed radius of 
a Pareto optimal point will generally be diminished. In 
this Appendix we present a few results regarding the in- 
tersections of the critical submanifolds of the observable 
objective functions ([!]), to facilitate such an analysis. As 
a special case, we also derive the conditions guaranteeing 
convergence to a weak Pareto optimum given in Lemma 
1. 

Denote by 9a the diagonal form of Oa in 
a basis Sa where its eigenvalues are arranged 
in increasing order; i.e., 9q = S^QaSa = 
diag{-fa(i),- ■ ■ ,7a(i);-- - ;7a(s„),--- >7a(s„)}, ac- 
cording to the convention in Section III. Set 
the diagonal matrix 8b = S^QbSa — S\Sb ■ 
c^«a.9{7f,(i)''" 'T6(i);--- ;7Kst,)'-" '76(s,)} • SlSa- 
Let U{vn.'j^) = SlSbU{mi,)SlSa, where W(mb) is defined 
as in Section III. 

We first obtain an upper bound on dim T^'^- any 
observable O^, it was shown in [T3| that the dimension 
of the critical manifold of is given by 

dim A^L = ^ "2 -I- g (^1) 

x—l y— 1 x—l y—l 

where the Vxy are overlap numbers that denote the num- 
ber of positions where the eigenvalues 702; and jay ap- 
pear simultaneously in p(0) and O^, and the rix, ruky 
are defined as in Section III. Then, we have dim Z^-'^ < 

min |dim M^, dim A^^| for the upper bound. 

Further analytical characterization of X^J^ = 

M^f^Ml, including a lower bound on its dimension, 
is possible when the two observables Qa and Ob com- 
mute. According to equation pUj) we have M^^f] Ml = 
iY(n)7r*i^(ma)nW(n)7r^'i^(m'J for any £ Pj,, S 
. This can be expressed 



y Z^(nV(W(m,)nW(mJ,))l |J 



u 



where = Vl. The first term in this ex- 
pression can be written {UttV \ U G W(n); V G 
W(ma) n Z//(m[,); tt e Pp}- Denote the Lie subgroup 
W(ma) n Vl{m'^) oiU{N) by W(P). Then, we have 

U U{n)^U{V) C 2^,. (A2) 



Let us partition the set Pp with the equivalence rela- 
tion ^ defined by tti ^ 7r2 if tt^ = Utt^V, for (J7, V) S 
U{n) X U{V). Then it is straightforward to show that 
tt's in different equivalence classes with respect to ~ 
label disconnected submanifolds of Urreri'^ W(n)7ri^(P). 
^Y(P) can be written U{pii) x •■• x U{ps^st)j where 
the Pxy are overlap numbers that denote the number 
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of positions where the eigenvalues ^ax and 7;,j, appear 
simultaneously in Qa and Q'j^. Together, these over- 
lap numbers form a contingency table 14], with rows 
indexed by x and columns indexed by y. The row 
sums are X)y=iP^j( ~ ^^"^ column sums are 

Based on these results we can now derive conditions 
guaranteeing convergence of ([6]) to a weak Pareto opti- 
mum. We have C if there is just one nonzero 
entry in each column of the contingency table (such 
that each S'lSh U{mi,y) SlSa ^ Uiniax) for just one 
X, hence U{P) = ^(m't,)), and C (such that 
ViCWl =Vl); then, Tj,^ = Mi- In this case we have 

the first equality in equation (|A2p and all tt e belong 
to the same equivalence class with respect to ^. If we 
set Qa = 6fe and Of, = Om, this proves the conditions 



guaranteeing convergence to Vw ' in Lemma 1. ■ 

We now obtain a lower bound on dim I^^^^ in the case 
of commuting Qa, Qb- Define Ft^{P,Q) : tt — > PttQ, 
where (P, Q) G W(n) x W(P), and let stab (F^) de- 
note the set of all {U, V) e U{n) x U{P) such that 
F^{U,V) = UttV — TT. It is straightforward to verify 
[11] that stab (F^) = U{n) f] ^rt U{V) tt. The lat- 
ter may be written W(Q) = W(giii) x • • • x l/l{qrs^st), 
where the overlap numbers qxyz denote the number of 
positions where the eigenvalue appears simultane- 
ously with Jay and after the permutation tt has 
been applied to Qa, Q'fj- It was shown in [l3| that 
any manifold of the type W(n) tt W(P) has dimension 
dim U{n) + dim ^^(P) — dim stab{FT^). Given that the 
dimension of U{N) is N"^, the lower bound on the di- 
mension of the intersection submanifold is thus 



dim rj, > max ] ^ + E E^^'v " E E E ' (^3) 

'"^'^n {x=l x = ly=l x = ly=lz = l ) 



In the case that the observables commute, we there- 
fore have both analytical lower and upper bounds on 
dim Z™^^'™'''^, whereas if they do not commute, we have 
only the upper bound. If the upper bound is less than 
dim A^™/'^, then the chance of the flow converging to 



Vw is infinitesimally small (although it may approach 
arbitrarily close). The lower bound is most useful when 
it equals dim A^™/'^, in which case there is a finite 

chance that the flow will converge 
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